Crime vs Cancer: A Look at Death through the Ages from 1999-2012

A Data Bootcamp Project by Arman Nasim

A special thanks to honorary group member Sean Rosario

Background

Death is part of life. Life after death. Death during life. Living and never being alive. These are the topics that consume human attention. Between 1990 and 1999, the major networks (ABC, NBC and CBS) devoted more coverage to crime than to any other topic on their nightly national newscasts. On local television news, crime consumed 30% of all news time, displacing coverage of other pressing issues. By comparison, topics like government (11%), health (7%), education (4%), and poverty (2%) received far less attention.

But the question is why? Why is crime such an interesting topic? Certainly, the crime of murder leads to death. But what about death due to disease, and cancer in particular? Why is crime so widely discussed when disease is not? This project compares United States national crime rates with cancer death rates from 1999 to 2012. I am particularly interested in how death is treated as a topic in the media.

Now let’s get our hands dirty with the data.


In [1]:
#Data:
# import packages 
import pandas as pd                   
import matplotlib.pyplot as plt 
%matplotlib inline 

#Data was obtained through the FBI UCR site and the CDC.
#http://www.stltoday.com/news/local/columns/editors-desk/study-ranks-topics-covered-by-the-media/article_a48c5923-bf8e-5f0e-b5de-e16cc1e186b4.html
#https://ucr.fbi.gov/crime-in-the-u.s/2013/crime-in-the-u.s.-2013/tables/1tabledatadecoverviewpdf/table_1_crime_in_the_united_states_by_volume_and_rate_per_100000_inhabitants_1994-2013.xls
#https://wonder.cdc.gov/controller/datarequest/D113;jsessionid=2364C44EFFBA084DEA6121C54154CA92
#http://www.bjs.gov/content/pub/pdf/apvsvc.pdf

folder = 'data/'
excel_file = folder + 'FBIcrimdata.xls'
df1 = pd.read_excel(excel_file)
df1 = df1.set_index("Year")
#print(df1)

# Plot the US population by year.
fig, ax = plt.subplots()
df1['Pop'].plot(ax=ax)
ax.set_ylabel('Pop', fontsize=12)

# The population is growing, from 260 M in 1994 to 320 M in 2012.


Out[1]:
<matplotlib.text.Text at 0x11543e668>
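Because the population grows over the period, raw counts and per-capita rates can tell different stories. The following is a minimal sketch of converting a count column into a rate per 100,000 inhabitants. It assumes that `Pop` and `Murder` hold raw yearly counts. `rate_per_100k` is a hypothetical helper, and the numbers are made-up placeholders, not FBI figures.

```python
import pandas as pd

def rate_per_100k(counts, population):
    """Return events per 100,000 inhabitants for two aligned Series."""
    return counts * 100_000 / population

# Placeholder values for illustration only; not taken from the FBI table.
demo = pd.DataFrame(
    {"Pop": [1_000_000, 2_000_000], "Murder": [50, 80]},
    index=pd.Index([2000, 2001], name="Year"),
)
demo["Murder rate"] = rate_per_100k(demo["Murder"], demo["Pop"])
print(demo)
```

In the notebook, the same helper could be applied to `df1['Murder']` and `df1['Pop']`, provided both columns are raw counts.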

In [2]:
fig, ax = plt.subplots()
df1['Murder'].plot(ax=ax)
ax.set_ylabel('Murder', fontsize=12)


Out[2]:
<matplotlib.text.Text at 0x1154e8908>

In [3]:
fig, ax = plt.subplots()
df1['Rape'].plot(ax=ax)
ax.set_ylabel('Rape', fontsize=12)


Out[3]:
<matplotlib.text.Text at 0x1155f8160>

In [4]:
fig, ax = plt.subplots()
df1['Violent Crime'].plot(ax=ax)
ax.set_ylabel('Violent Crime', fontsize=12)


Out[4]:
<matplotlib.text.Text at 0x1154a7b70>
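The four cells above repeat the same three plotting lines with a different column name each time. One possible way to factor that out is sketched below. `plot_column` is a hypothetical helper, not part of the original notebook, and the small frame only stands in for `df1` with placeholder values.

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_column(df, column, fontsize=12):
    """Plot one column of df against its index and label the y-axis."""
    fig, ax = plt.subplots()
    df[column].plot(ax=ax)
    ax.set_ylabel(column, fontsize=fontsize)
    return ax

# Placeholder frame standing in for df1; the values are illustrative only.
demo = pd.DataFrame(
    {"Murder": [3, 2, 1], "Rape": [6, 5, 4]},
    index=pd.Index([2000, 2001, 2002], name="Year"),
)
axes = [plot_column(demo, name) for name in ["Murder", "Rape"]]
```

With the real data, the loop would run over `['Pop', 'Murder', 'Rape', 'Violent Crime']` on `df1`.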

In [6]:
folder = 'data/'
excel_file3 = folder + 'CancerDeaths2.xls'
df3 = pd.read_excel(excel_file3)
df3 = df3.set_index('Year')

#print(df3)

fig, ax = plt.subplots()
df3.plot(ax=ax)
ax.set_ylabel('Cancer Deaths', fontsize=12)


Out[6]:
<matplotlib.text.Text at 0x115af76d8>

In [7]:
folder = 'data/'
excel_file2 = folder + 'CancerDeaths.xls'
df2 = pd.read_excel(excel_file2)
df2 = df2.set_index('Age')

#print(df2)

fig, ax = plt.subplots()
df2.plot(ax=ax)
ax.set_ylabel('Cancer Deaths', fontsize=12)

# There were 20,852,286 total cancer deaths between 1999 and 2012.
# In 1999, ages 12-24 made up 24% of murder victims, ages 25-49 made up 53%, and ages 50+ made up 12%.
# Older people appear more likely to die of cancer, while the 25-49 age group is murdered more often.


Out[7]:
<matplotlib.text.Text at 0x115b86898>

In [8]:
# One thing I am curious about: is cancer killing younger or older people after a decade?

folder = 'data/'
excel_file6 = folder + 'CancerDeaths3.xls'
df6 = pd.read_excel(excel_file6)
df6 = df6.set_index('Age')

#print(df6)

fig, ax = plt.subplots()
df6.plot(ax=ax)
ax.set_ylabel('Cancer Deaths', fontsize=12)

# The data shows that, after a period of 13 years, cancer is killing more people overall and more younger people.


Out[8]:
<matplotlib.text.Text at 0x115bf8b70>
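Comparing raw counts across years mixes two effects: more deaths overall, and a shift in which ages are affected. One way to separate them is to convert each year's column to percentage shares. The sketch below assumes one column per year and one row per age group. The labels and numbers are placeholders, not CDC data.

```python
import pandas as pd

# Placeholder counts for illustration only; not CDC figures.
demo = pd.DataFrame(
    {"1999": [10, 30, 60], "2012": [30, 60, 110]},
    index=pd.Index(["<25", "25-49", "50+"], name="Age"),
)

# Divide each column by its own total so that every year sums to 100%.
shares = demo.div(demo.sum(axis=0), axis=1) * 100
print(shares.round(1))
```

In this made-up example the total doubles, while the youngest group's share rises from 10% to 15%. Applied to `df6`, the same normalization would show whether younger age groups really account for a larger share of cancer deaths in the later year.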

In [9]:
#Sidebar over, back to the task.

folder = 'data/'
excel_file4 = folder + 'UltData.xls'
df4 = pd.read_excel(excel_file4)
df4 = df4.set_index('Year')

#print(df4)

fig, ax = plt.subplots()
df4.plot(ax=ax)
ax.set_ylabel('Deaths', fontsize=12)


Out[9]:
<matplotlib.text.Text at 0x115bf29e8>
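When two series are plotted together, the year in which one overtakes the other can be read off programmatically instead of by eye. The sketch below shows one way to do that. The column names of `UltData.xls` are not shown in this notebook, so `"A"` and `"B"` are placeholders, the values are invented, and `first_year_above` is a hypothetical helper.

```python
import pandas as pd

def first_year_above(df, a, b):
    """Return the first index label where column a exceeds column b, or None."""
    above = df[a] > df[b]
    return above.idxmax() if above.any() else None

# Placeholder series for illustration only.
demo = pd.DataFrame(
    {"A": [1, 2, 4, 5], "B": [3, 3, 3, 3]},
    index=pd.Index([2000, 2001, 2002, 2003], name="Year"),
)
print(first_year_above(demo, "A", "B"))
```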

In [10]:
folder = 'data/'
excel_file5 = folder + 'MediaCov.xls'
df5 = pd.read_excel(excel_file5)
df5 = df5.set_index('Topic')
print(df5)

# Pie chart left disabled: DataFrame.plot.pie needs a column (y=...) or subplots=True.
#fig, ax = plt.subplots()
#df5.plot.pie(y='Media Coverage', ax=ax)
# The 2012 coverage can still be read from the printed data.


                     Media Coverage
Topic                              
Elections/ Politics           21.30
US Foreign Affairs            13.60
Foreign                       11.00
Crime                          6.60
Gov Agencies                   5.30
Economy                        5.00
Disasters                      4.20
Health                         3.60
Business                       3.10
Lifestyle                      3.00
Misc                           2.50
Domestic Affairs               2.30
Media                          2.30
Military                       2.30
Sports                         1.70
Environment                    1.70
Terrorism                      1.60
Celeb Entertainment            1.50
Science                        1.20
Race/ Gender Issues            1.10
Transportation                 1.00
Education                      0.90
Religion                       0.80
Court/ Legal System            0.40
Development                    0.01
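
A pie chart of this table can be drawn once pandas is told which column to plot: `DataFrame.plot.pie` requires either `y=` or `subplots=True`. The sketch below uses only a few rows copied from the printed table so that it runs on its own. The slices are therefore relative to that subset, not to all coverage.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A few rows copied from the printed table; in the notebook, df5 would be used instead.
coverage = pd.DataFrame(
    {"Media Coverage": [21.30, 13.60, 11.00, 6.60, 3.60]},
    index=pd.Index(
        ["Elections/ Politics", "US Foreign Affairs", "Foreign", "Crime", "Health"],
        name="Topic",
    ),
)

fig, ax = plt.subplots()
# y= selects the column to draw; the index supplies the slice labels.
coverage.plot.pie(y="Media Coverage", ax=ax, legend=False)
ax.set_ylabel("")  # drop the default y-label, which would repeat the column name
```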

Conclusion

We see that as the US population grew by roughly 23% between 1994 and 2012 (from about 260 million to about 320 million), cancer deaths increased while violent crime, and murders in particular, declined. In fact, in 2002/2003 cancer deaths exceeded murders, and the gap continues to widen. Yet media coverage does not reflect this difference in death tolls: in the 2012 data, the media covered crime 6.6% of the time but health-related issues only 3.6% of the time.
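
The figures quoted in this notebook can be cross-checked with a few lines of arithmetic. The sketch below uses the population endpoints from the comment in the first code cell and the Crime and Health rows of the media coverage table. `percent_change` is a hypothetical helper.

```python
def percent_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100

# Population figures quoted in the first code cell's comment (in millions).
pop_growth = percent_change(260, 320)

# Crime and Health rows from the media coverage table.
crime_share, health_share = 6.60, 3.60
coverage_ratio = crime_share / health_share

print(f"Population growth: {pop_growth:.1f}%")                   # 23.1%
print(f"Crime-to-health coverage ratio: {coverage_ratio:.2f}")   # 1.83
```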

